Exploiting regularity in Cyc for text generation

Authors

  • François Pachet
  • Hafedh Mili
Abstract

To provide hypertext with text generation facilities, we combine hypertext with knowledge bases to create expertext systems. This paper studies a particular characteristic of hierarchical semantic nets called regularity, which has proved to be a very valuable tool for expertext construction and use. We show how regularity can be transposed to the context of the Cyc knowledge base to provide an operational tool for understanding the knowledge base's intimate topology. Using two radical and provocative hypotheses, i) general if-then rules express irregularities, and ii) "the more regular, the less interesting", we associate irregularity with the relevance of text units, to assist expertext users in a number of ways, including finding relevant places to input text, helping formulate reading or writing plans, understanding the overall system, and suggesting modifications to the knowledge base designers.

1. From hypertext to expertext

The combination of hypertext technologies with expert systems techniques is called an "expertext system" [Rada 89]. Following [Streitz 89] and [Wand & al 91], we think that writing is a complex problem-solving activity whose product can be viewed as an external representation of the author's internal knowledge structures. This study is part of a general study on building expertext from existing knowledge bases, aimed at studying the nature of writing plans [Rada & Barlow 89] as hypertext traversal strategies [Mili & Rada 88, 90a]. We use the Cyc knowledge base [Lenat & Guha 90] as a testbed for experimenting with our techniques, and as a basis for the construction of expertext.

This paper is based on two observations resulting from the study of the Cyc system in the context of expertext construction. The Cyc system contains an enormous amount of information and intelligence, but it uses a variety of knowledge representation techniques, which makes it hard to understand at a global level. While trying to use the system and manage its complexity, we came to a first, paradoxical observation concerning the nature of the knowledge embodied in Cyc: certain types of knowledge tend to increase regularity, in a precise and computable sense, whereas others tend to break those regularities. The second observation, obtained by applying the regularity calculus to Cyc, is that for most relations explicitly present in the system, the more regular a relation is, the less interesting it is as a potential text unit holder.

The paper is organized as follows. We first briefly introduce the Cyc system and its main characteristics with regard to expertext construction, and recall the definition and main results of the regularity calculus as defined for hierarchical semantic nets. We then apply the notion of regularity to the Cyc system, and show how various inference patterns of the system have corresponding regularity properties. Finally, we discuss the application of the regularity hypothesis to expertext construction.

Similar resources

Chapter 8 Regularity, Document Generation, and Cyc

We are interested in building and maintaining semantic nets in general, and hierarchical semantic nets in particular. In [Mili, 1988], we developed a model of hierarchical semantic nets that generalizes taxonomic models by replacing the concept of property inheritance by a more general behavior of properties that we called regularity [Mili & Rada, 1990a]. We are also interested in the generatio...


Exploiting text for extracting image processing resources

Much everyday knowledge about physical aspects of objects does not exist as computer data, though such computer-based knowledge will be needed to communicate with next generation voice-commanded personal robots as well as in other applications involving visual scene recognition. The largest attempt at manually creating common-sense knowledge, the CYC project, has not yet produced the information n...


On the Application of the Cyc Ontology to Word Sense Disambiguation

This paper describes a novel, unsupervised method of word sense disambiguation that is wholly semantic, drawing upon a complex, rich ontology and inference engine (the Cyc system). This method goes beyond more familiar semantic closeness approaches to disambiguation that rely on string cooccurrence or relative location in a taxonomy or concept map by 1) exploiting a rich array of properties, in...


Castelnuovo-Mumford regularity of products of monomial ideals

Let $R=k[x_1,x_2,\cdots, x_N]$ be a polynomial ring over a field $k$. We prove that for any positive integers $m, n$, $\text{reg}(I^mJ^nK)\leq m\,\text{reg}(I)+n\,\text{reg}(J)+\text{reg}(K)$ if $I, J, K\subseteq R$ are three monomial complete intersections ($I$, $J$, $K$ are not necessarily proper ideals of the polynomial ring $R$), and $I, J$ are of the form $(x_{i_1}^{a_1}, x_{i_2}^{a_2}, \cdots, x_{i_l...


Computing semantic relatedness of words and texts in Wikipedia-derived semantic space

Adequate representation of natural language semantics requires access to vast amounts of common sense and domain-specific world knowledge. Prior work in the field was either based on purely statistical techniques that did not make use of background knowledge or on huge manual efforts, such as the CYC projects. Here we propose a novel method, called Explicit Semantic Analysis (ESA), for finegrai...



Publication date: 1995